Non Parametric Methods for Genomic Inference

Authors

  • Peter J. Bickel
  • Nathan Boley
  • James B. Brown
  • Haiyan Huang
  • Nancy R. Zhang
Abstract

Large-scale statistical analysis of data sets associated with genome sequences plays an important role in modern biology. A key component of such statistical analyses is the computation of p-values and confidence bounds for statistics defined on the genome. Currently such computation is commonly achieved through ad hoc simulation measures. The method of randomization, which is at the heart of these simulation procedures, can significantly affect the resulting statistical conclusions. Most simulation schemes introduce a variety of hidden assumptions regarding the nature of the randomness in the data, resulting in a failure to capture biologically meaningful relationships. To address the need for a method of assessing the significance of observations within large-scale genomic studies, where there often exists a complex dependency structure between observations, we propose a unified solution built upon a data subsampling approach. We propose a piecewise stationary model for genome sequences and show that the subsampling approach gives correct answers under this model. We illustrate the method on three simulation studies and two real data examples.
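To make the subsampling idea concrete, here is a minimal sketch of generic contiguous-block subsampling for the mean of a dependent sequence. It is not the authors' procedure: the function name `block_subsample_ci`, its parameters, the square-root scaling, and the choice of the mean as the statistic are all assumptions made for illustration, and the piecewise stationary model from the abstract is not implemented.

```python
import math
import random


def block_subsample_ci(x, block_len, n_draws=500, alpha=0.05, seed=0):
    """Subsampling confidence interval for the mean of a dependent sequence.

    Hypothetical helper for illustration only: a generic block-subsampling
    sketch, assuming a sqrt(n) convergence rate for the sample mean.
    """
    n = len(x)
    if not 0 < block_len <= n:
        raise ValueError("block_len must be between 1 and len(x)")
    rng = random.Random(seed)
    theta_hat = sum(x) / n  # statistic computed on the full sequence

    # Each subsample is a contiguous block, so the local dependence between
    # neighbouring positions is kept intact rather than shuffled away.
    roots = []
    for _ in range(n_draws):
        start = rng.randrange(n - block_len + 1)
        block = x[start:start + block_len]
        theta_b = sum(block) / block_len
        roots.append(math.sqrt(block_len) * (theta_b - theta_hat))
    roots.sort()

    def quantile(p):
        # Simple empirical quantile of the sorted subsampling roots.
        return roots[min(n_draws - 1, int(p * n_draws))]

    # The roots approximate the law of sqrt(n) * (theta_hat - theta),
    # so their quantiles are rescaled by sqrt(n) to form the interval.
    scale = math.sqrt(n)
    lower = theta_hat - quantile(1 - alpha / 2) / scale
    upper = theta_hat - quantile(alpha / 2) / scale
    return theta_hat, lower, upper
```

For a 0/1 feature track `x` along a chromosome, `block_subsample_ci(x, block_len=40)` returns the observed coverage together with an approximate 95% interval. The block length is a tuning choice: it should be long enough to span the dependence in the data yet much shorter than the sequence.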

Related articles

Predictive Ability of Statistical Genomic Prediction Methods When Underlying Genetic Architecture of Trait Is Purely Additive

A simulation study was conducted to address how a purely additive (simple) genetic architecture might affect the efficacy of parametric and non-parametric genomic prediction methods. For this purpose, we simulated a trait with narrow-sense heritability h² = 0.3, with only additive genetic effects for 300 loci, in order to compare the predictive ability of 14 more practically used ge...

Accuracy of Genomic Selection with Parametric and Nonparametric Methods under Additive and Dominance Genetic Architectures

In most genomic prediction studies, only additive effects are used in models for estimating genomic breeding values (GEBV). However, dominance genetic effects are an important source of variation for complex traits, and taking them into account may improve the accuracy of GEBV. In the present study, performed using simulated data, the effect of different heritability values (0.1...

Optimal Fuzzy Nonparametric Predictive Inference for a Single-Stage Acceptance Sampling Plan

Acceptance sampling is one of the main parts of statistical quality control. It is primarily used for the inspection of incoming or outgoing lots. Acceptance sampling procedures can be used in an acceptance control program to reach better quality with lower expenses, improved control, and increased efficiency. The aim of this paper is to study acceptance sampling based on non-...

Bayesian Nonparametric and Parametric Inference

This paper reviews Bayesian nonparametric methods and discusses how parametric predictive densities can be constructed using nonparametric ideas.

Assessing significance [JF 20]

With the exception of Bayesian analysis, phylogenetic inference procedures typically identify a best estimate of phylogenetic relationships, a so-called point estimate of the phylogeny. However, the point estimate is often relatively uninteresting in itself unless we have some measure of its reliability. This lecture will be about techniques for examining the robustness or significance of the r...

Tuning and Application of the Random Forest Algorithm in Genomic Evaluation

One of the most important issues in genomic selection is choosing a suitable method for estimating marker effects and for genomic evaluation. Recently, machine learning algorithms, which belong to the class of non-parametric and non-linear methods, have been extended to genomic evaluation. One of these methods is Random Forest (RF), on which this research focused. Important parameters in the RF algorithm are the n...

Publication date: 2009